Training complex neural architectures has brought about a small revolution in the machine learning world. In IDA, we extend neural networks with capabilities known from symbolic AI, such as (logical) reasoning and learning from complex, interconnected (relational) data. We also propose advanced architectures for gene expression inference that use hardware efficiently, achieving better performance than state-of-the-art architectures under the same hardware constraints. We leverage deep learning in our application domains, such as sports analytics and bioinformatics.
Building on well-grounded theoretical principles, we formulate algorithms for challenging optimization problems arising in a variety of settings. In particular, we develop continuous constrained optimization algorithms as used in engineering design, online path-following procedures for process control, and stochastic (random) optimization methods, with a particular focus on machine learning. This includes developing methods superior to standard stochastic gradient descent in particular settings, using parallel hardware to speed up training in wall-clock time, distributed and federated learning, and posterior sampling.
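To make the baseline concrete, the following is a minimal sketch of plain stochastic gradient descent on a toy problem (minimizing the mean squared distance to a set of points, whose optimum is the sample mean). This is a generic illustration, not one of our methods; the function names are ours.

```python
import random

def sgd(grad_sample, w0, data, lr=0.1, epochs=50, seed=0):
    """Plain SGD: one per-sample gradient step at a time."""
    rng = random.Random(seed)
    w = w0
    for _ in range(epochs):
        rng.shuffle(data)          # fresh sample order each epoch
        for x in data:
            w -= lr * grad_sample(w, x)
    return w

# Toy objective: mean of (w - x)^2 over the data; the per-sample
# gradient is 2 * (w - x), and the minimizer is the sample mean.
data = [1.0, 2.0, 3.0, 4.0]
w = sgd(lambda w, x: 2 * (w - x), 0.0, data)
```

With a constant learning rate the iterate keeps fluctuating around the optimum (here 2.5) instead of converging exactly; improving on this behavior in specific settings is one motivation for the methods mentioned above.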
We develop algorithms for calculating the similarity of partially assembled data. Classical applications of these algorithms include phylogeny and other clustering techniques; however, the measure also transfers to other machine learning tasks, such as classification. Beyond that, we are interested in applying relational and logic-based learning algorithms to biological estimation and prediction problems with hybrid (real-valued and discrete) and structured data and expressive background knowledge.
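As a toy illustration of such a similarity measure (an assumption for illustration only, not our actual method), one can compare partially assembled sequences by the Jaccard distance of their k-mer sets, which tolerates missing fragments:

```python
def kmers(seq, k=3):
    """Set of overlapping k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def jaccard_distance(a, b, k=3):
    """1 - |A & B| / |A | B| over k-mer sets; 0.0 means identical content."""
    ka, kb = kmers(a, k), kmers(b, k)
    return 1.0 - len(ka & kb) / len(ka | kb)

# Pairwise distance matrix over (possibly partial) toy sequences;
# such a matrix is the usual input to phylogeny and clustering methods.
seqs = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TTTTGGGG"}
dist = {(i, j): jaccard_distance(seqs[i], seqs[j])
        for i in seqs for j in seqs if i < j}
```

The resulting distance matrix can feed hierarchical clustering for phylogeny, or serve as a kernel-like input to classifiers, which is the transferability mentioned above.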
In the application domain of predictive sports analytics, we benefit from our development of machine learning models and their interconnection with mathematical optimization tasks. Our core testbed application is exploiting betting markets with a proper combination of the two techniques, such as in the project of End-to-end learning of optimal portfolios. We further focus on creating novel predictive models for match outcomes across different sports domains (football, basketball, tennis), input statistics (scores, features, ratings), levels of granularity (individual, team), and game mechanics (standard, e-sports, fantasy sports).
We develop methods and software tools for understanding molecular biology data, specializing in machine learning and statistical methods. One of our main goals is to develop interpretable and accurate models from datasets where the number of samples is much smaller than the number of features. A key challenge here is overfitting, which we mitigate with the aid of prior knowledge available in bioinformatics databases (gene annotations, interaction databases, metabolic and signaling pathways, etc.).
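One simple way prior knowledge can curb overfitting is to collapse thousands of per-gene features into a handful of pathway-level scores before model fitting. The sketch below assumes a toy annotation map (real groupings would come from databases such as KEGG or GO); names and data are hypothetical.

```python
from statistics import mean

# Hypothetical pathway annotations: pathway -> member genes.
pathways = {
    "pathA": ["g1", "g2"],
    "pathB": ["g3", "g4", "g5"],
}

def pathway_scores(sample, pathways):
    """Collapse per-gene expression into per-pathway mean scores,
    shrinking the feature space from genes to pathways."""
    return {p: mean(sample[g] for g in genes if g in sample)
            for p, genes in pathways.items()}

# One sample with five gene-expression values -> two pathway features.
sample = {"g1": 2.0, "g2": 4.0, "g3": 1.0, "g4": 1.0, "g5": 4.0}
scores = pathway_scores(sample, pathways)
```

A downstream model then fits far fewer parameters, and the learned coefficients are interpretable at the pathway level.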
We develop and apply statistical methods to various biological problems. We specialize in omics data analysis and fusion; our typical goal is to interpret these data in terms of simplified and understandable statistical models. A simple example is differential gene expression analysis followed by enrichment analysis.
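That two-step example can be sketched in a few lines: score each gene with a Welch t statistic between conditions, then ask whether the significant genes are over-represented in an annotated set via a hypergeometric tail probability. The data below are a toy illustration, not real measurements.

```python
from statistics import mean, stdev
from math import comb, sqrt

def welch_t(xs, ys):
    """Welch's t statistic for two independent samples."""
    vx, vy = stdev(xs) ** 2, stdev(ys) ** 2
    return (mean(xs) - mean(ys)) / sqrt(vx / len(xs) + vy / len(ys))

def enrichment_p(k, K, n, N):
    """Hypergeometric tail P(X >= k): chance that, of n selected genes
    out of N total, at least k fall in an annotated set of size K."""
    return sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(K, n) + 1)) / comb(N, n)

# Step 1: differential expression on toy data (three replicates each).
control = {"g1": [1.0, 1.1, 0.9], "g2": [2.0, 2.1, 1.9]}
treated = {"g1": [3.0, 3.2, 2.8], "g2": [2.0, 1.9, 2.1]}
hits = [g for g in control if abs(welch_t(treated[g], control[g])) > 3]

# Step 2: toy enrichment: 3 of 10 genes annotated, 3 selected, 2 annotated.
p = enrichment_p(2, 3, 3, 10)
```

In practice one would also correct for multiple testing across genes; the sketch omits that step for brevity.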
Learning theory studies why machine learning works. We study learning theory in the relational learning setting where many standard assumptions, such as the “i.i.d. assumption”, do not hold. This makes the studied problems more challenging but also more interesting. We also study how natural aspects of learning such as generalization of facts into rules in the presence of background knowledge can be modeled in formal logic frameworks. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
We develop algorithms that can learn from relational data (databases, graphs, networks, ontologies) accounting for both the structural and probabilistic aspects of the data. We also study theoretical aspects of relational learning problems such as their computational complexity and sample complexity. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
We study machine-learning algorithms that learn hypotheses and models represented in human-readable languages such as first-order logic. We also try to endow non-symbolic frameworks such as neural networks with symbolic elements to improve the interpretability of the learned models. We contribute to the fields of inductive logic programming, statistical relational learning and related areas.
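To illustrate what a human-readable hypothesis looks like, here is a classic first-order rule, grandparent(X, Z) :- parent(X, Y), parent(Y, Z), evaluated over toy relational facts by a self-join. The facts are hypothetical and the evaluation is a sketch, not our learning algorithm.

```python
# Toy relational facts: parent(X, Y) as a set of pairs.
parent = {("ann", "bob"), ("bob", "cia")}

def grandparent(facts):
    """Evaluate grandparent(X, Z) :- parent(X, Y), parent(Y, Z)
    by joining the parent relation with itself on the shared variable Y."""
    return {(x, z) for (x, y) in facts for (y2, z) in facts if y == y2}

gp = grandparent(parent)
```

The appeal of such hypotheses is that the learned rule itself is the explanation: a domain expert can read and verify it directly, unlike the weights of a neural network.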